Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis
نویسندگان
چکیده
This paper documents the significant components of a state-ofthe-art language-independent query-by-example spoken term detection system designed for the Query by Example Search on Speech Task (QUESST) in MediaEval 2015. We developed exact and partial matching DTW systems, and WFST based symbolic search systems to handle different types of search queries. To handle the noisy and reverberant speech in the task, we trained tokenizers using data augmented with different noise and reverberation conditions. Our postevaluation analysis showed that the phone boundary label provided by the improved tokenizers brings more accurate speech activity detection in DTW systems. We argue that acoustic condition mismatch is possibly a more important factor than language mismatch for obtaining consistent gain from stacked bottleneck features. Our post-evaluation system, involving a smaller number of component systems, can outperform our submitted systems, which performed the best for the task.
منابع مشابه
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
This working paper provides the basic information about experiments conducted on audio documents within the MediaEval 2012 spoken web search evaluation project. The main purpose of these experiments was to build a robust and language independent system for spoken term detection. Therefore we have proposed query-by-example searching system based on the minimum-cost alignment of DTW algorithm and...
متن کاملMediaEval 2013 Spoken Web Search Task: System Performance Measures
This document discusses how to measure system performance in the Spoken Web Search (SWS) task at MediaEval 2013. The discussion is based on different sources, including the NIST 2006 Spoken Term detection (STD) Evaluation Plan [1], the NIST 2010 Speaker Recognition Evaluation (SRE) Plan [2], the description of the scoring criteria applied in the SWS task at Mediaeval 2012 [3], the Albayzin 2012...
متن کاملNTU System at MediaEval 2015: Zero Resource Query by Example Spoken Term Detection using Deep and Recurrent Neural Networks
This note serves as a documentation describing the methods the authors of this paper implemented for the Query by Example Search on Speech Task (QUESST) as a part of MediaEval 2015. In this work, we combined DTW, DNN and RNN in one framework to perform query by example spoken term detection in a zero resource setting.
متن کاملSpeeD @ MediaEval 2015: Multilingual Phone Recognition Approach to Query by Example STD
In this paper, we attempt to solve the Spoken Term Detection (STD) problem for under-resourced languages by a phone recognition approach within the Automatic Speech Recognition (ASR) paradigm, with multilingual acoustic models from six languages (Albanian, Czech, English, Hungarian, Romanian and Russian). The Power Normalized Cepstral Coefficients (PNCC) features are used for improved robustnes...
متن کاملIIIT-H SWS 2013: Gaussian Posteriorgrams of Bottle-Neck Features for Query-by-Example Spoken Term Detection
This paper describes the experiments conducted for spoken web search (SWS) at MediaEval 2013 evaluations. A conventional approach is to train a multi-layer perceptron using high resource languages and then use it in the low resource scenario. However, phone posteriorgrams have been found to under-perform when the language they were trained on differs from the target language. In this paper, we ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016